Effective Data Science: Acquiring and Sharing Data
Structured Query Language
SELECT, UPDATE, DELETE, INSERT, WHERESQL is case instenstive. Conventions:
ALLCAPS for SQL verbsTitle Case for table nameslower case for field namesData base management systems
Q: When would you prefer each of these?
Mac OS
Linux
Windows - use git for Windows to get unix-like command line.
survey.db from blackboard.Our command line output could look better:
Try again
Selecting a single field from a table
Selecting multiple fields from a table
Distinct levels
Distinct pairs or touples
Records are not in a fixed order and may change between queries
# ascending order by default
SELECT * FROM Person ORDER BY id;
SELECT * FROM Person ORDER BY id ASC; -- for emphasis
# switch to descending
SELECT * FROM Person ORDER BY id DESC;Q: What do you expect to see?
Filter with WHERE, combine logical conditons (=, >, <, …) with AND and OR.
SELECT person, reading FROM Survey WHERE Person = 'lake';
SELECT person, reading FROM Survey WHERE Person = 'lake' OR Person = 'roe';
SELECT person, reading FROM Survey WHERE Person IN ('lake', 'roe');-- Be explicit about order of operations
SELECT person, reading FROM Survey
WHERE (person = 'lake' OR person = 'roe') AND reading > 10;
SELECT person, reaing FROM Survey
WHERE person = 'lake' OR (person = 'roe' AND reading > 10);Q: Which matches the case with no parentheses?
LIKE and the % wildcard.
Combined, these behave like * in regex.
Q: Is the following true?
Join makes a cross product with 3 * 8 = 24 entries
Tell SQL how to match the tables by specifying a primary key and secondary key
Notice the Table.field synax, this aviods issues with duplicate column names.
{dplyr}{dbplyr}Acquiring and Sharing Data Checklist
Effective Data Science: Acquiring and Sharing Data - SQL - Zak Varty